PhD in English from 2013, and freelancing
Data journalist at The Times and Sunday Times since 2016
Data advisor at Global Witness from next month
Cyber-crime and the dark net
Transparency and open data
Bringing innovative techniques to data journalism
R programming language
Tidyverse family of packages
Elasticsearch and Kibana
Access
Amalgamation (OK, this one’s not great)
Analysis
People talk a lot about creating new data
Another way to think of it: accessing hidden data
Examples: web scraping, using APIs, getting data from PDFs and Word documents, working with data bigger than Excel can handle
2 + 2 = 5
When combined, data is more than the sum of its parts
Examples: joins and fuzzy matching, working with weird file formats, tidying data
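To make "joins and fuzzy matching" concrete, here is a minimal sketch in base R. The two tables, the company names and the distance threshold are all made up for illustration. The point is only that an exact join can miss records which an edit-distance match recovers.

```r
# Two made-up tables naming the same companies with slightly different spellings
contracts <- data.frame(
  supplier = c("Acme Holdings Ltd", "Globex Corporation", "Initech"),
  value    = c(120000, 450000, 80000),
  stringsAsFactors = FALSE
)
donations <- data.frame(
  donor  = c("ACME Holdings Limited", "Globex Corporaton", "Umbrella plc"),
  amount = c(5000, 20000, 1000),
  stringsAsFactors = FALSE
)

# An exact join finds nothing, because no two names are spelled identically
exact <- merge(contracts, donations, by.x = "supplier", by.y = "donor")

# Fuzzy match: lower-case, strip punctuation, then compare by edit distance
normalise <- function(x) tolower(gsub("[^[:alnum:] ]", "", x))
distances <- adist(normalise(contracts$supplier), normalise(donations$donor))

# For each supplier, pick the donor with the smallest edit distance
best <- apply(distances, 1, which.min)
matched <- data.frame(
  supplier = contracts$supplier,
  donor    = donations$donor[best],
  distance = distances[cbind(seq_along(best), best)],
  stringsAsFactors = FALSE
)

# Keep only close matches; the threshold is a hand-picked assumption
max_distance <- 5
matched <- matched[matched$distance <= max_distance, ]

# Use the lookup table of matched names to join the two original tables
joined <- merge(merge(contracts, matched, by = "supplier"), donations, by = "donor")
joined
```

In practice the threshold needs checking by eye: too loose and unrelated names get joined, too tight and genuine matches are lost.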
How can programming help us see stories?
Visualisation is important, but not the be-all and end-all
Examples: geospatial analysis, time series analysis, statistical analysis, search
Getting information from a website into structured form
90% of scraping jobs for stories follow this format:
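library(tidyverse)  # provides the %>% pipe
library(rvest)      # HTML parsing and CSS-selector helpers

# Download the front page, select headline elements by CSS class, extract their text
read_html("https://www.thetimes.co.uk/") %>%
  html_nodes(".Item-headline") %>%
  html_text()
## [1] "Melanie Phillips"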
## [2] "Brain fitness"
## [3] "New challenge"
## [4] "The Daily Quiz"
## [5] "Blow for May as backstop risk ‘unchanged’"
## [6] "Second heavy defeat almost certain"
## [7] "May’s new Brexit deal and what it means for the backstop"
## [8] "Fifth letter bomb has yet to be found, claims ‘IRA’"
## [9] "Ten countries ground 737s as safety fears grow"
## [10] "Pilots fear system is flawed"
## [11] "Air pollution ‘kills more people than smoking’"
## [12] "Dirty air causes harm from cradle to the grave"
## [13] "Great British walk that’s the peak of perfection"
## [14] "Spread out the maps, it’s spring at last"
## [15] "Gangs ‘give pupils knives to get them thrown out of school’"
## [16] "One in five women killed by their partners had contacted the police"
## [17] "MI5: Thatcher shielded MP accused of child abuse"
## [18] "Body found in search for British backpacker"
## [19] "BBC under investigation over pay discrimination"
## [20] "TV chef sues over gastropub revamp"
## [21] "Storm Gareth bringing 80mph winds"
## [22] "Housing developer forced to ditch ‘wicked’ tree nets"
## [23] "May claims victory in Brexit backstop talks"
## [24] "PM dashes to Strasbourg on a wing and a prayer to save Brexit"
## [25] "EU planning for year-long extension, diplomats say"
## [26] "Raab leadership bid opens with pledge on social mobility"
## [27] "PM takes flight to be far from the maddest crowd"
## [28] "She’s gone and done it - but will it be enough?"
## [29] "Cabinet row over cash to stop wave of knife crime"
## [30] "Rebel group draws a third of Labour MPs"
## [31] "Family of Isis bride pleads with Javid to show mercy"
## [32] "More abuse, fewer arrests: the figures from stretched police"
## [33] "A sixth of London homicides are linked to violence against women"
## [34] "Victim calls for register of abusers"
## [35] "New approach stretches beyond crime to culture"
## [36] "Patients face longer A&E waits as targets scrapped, doctors warn"
## [37] "More work needed to convince patients"
## [38] "Captain Marvel fights off online trolls with $455m opening weekend"
## [39] "Nun’s letter shakes up history of Lisbon earthquake"
## [40] "Honour for British UN worker killed in Boeing crash"
## [41] "Mystery of what happened on flight 302 could soon be revealed"
## [42] "China bans use of jet amid fears over safety"
## [43] "Lucky passenger was turned away at gate"
## [44] "Stabiliser may have tipped fatal flight out of pilots’ control"
## [45] "Marine who lost leg rows Atlantic in record time"
## [46] "Solar storms could cripple modern life"
## [47] "For perfect family TV, bring back The Bill"
## [48] "Alexa, popcorn and bakeware enter the great British basket"
## [49] "School offers PPE course to inspire girls into politics"
## [50] "Eye test could spot Alzheimer’s earlier"
## [51] "Digital ‘friends’ help students get to lessons"
## [52] "Sins of the (Time) Lord"
## [53] "In pictures: Commonwealth Day"
## [54] "Heathrow extension ‘will be the size of another Gatwick’"
## [55] "Policewoman took £34,000 in gifts from lonely widower, 87"
## [56] "Tears of teenage boy in court over Jodie stabbing"
## [57] "Fan jailed for punching player on pitch"
## [58] "Who likes the Independent Group and where do they stand on policy?"
## [59] "I’ll vote against this deal and face Brexit voters at the ballot box"
## [60] "May has one last escape route from this sorry saga"
## [61] "Hammond has a golden chance to build for the homeless"
## [62] "News in pictures"
## [63] "EU waits to see winner of this power struggle"
## [64] "No Phoebe, violence doesn’t empower women"
## [65] "Zuckerberg’s view of privacy is self-serving"
## [66] "Advice for a soft lad from his big lad friend"
## [67] "Face-offs that start with a peach can end with a brick"
## [68] "Four-hour target for A&E patients has had its day"
## [69] "Avoidable Damage"
## [70] "Flight Hazards"
## [71] "Chronicler of Catastrophe"
## [72] "Call for tighter control of motorbike emissions"
## [73] "Nature notes"
## [74] "Birthdays today"
## [75] "New Assad statue provokes wave of protests"
## [76] "Iraq agrees to take back all 20,000 of its jihadis"
## [77] "US set to eclipse Saudi Arabia as world’s biggest oil exporter"
## [78] "President of Algeria abandons re‑election bid"
## [79] "Syria’s fate in hands of world leaders jockeying for power"
## [80] "Guaidó pleads for foreign help to solve power crisis"
## [81] "German gnome industry on its knees"
## [82] "Gardener targeted clients from beyond grave"
## [83] "Democrats threaten Trump border wall"
## [84] "Fundraiser sold access to White House"
## [85] "Secret burial for Göring’s daughter, 80"
## [86] "Five Star backs down in vaccination row"
## [87] "King with 14 wives loses his second queen in a year"
## [88] "28 years on, Ethiopian ‘killers’ still in embassy"
## [89] "Woman accused of poisoning Kim’s half brother is freed"
## [90] "Indian poll row as Modi uses captured pilot in campaigning"
## [91] "Salvini rejects Saudi backing for La Scala"
## [92] "Traditional cheesemakers kick up a stink with MPs"
## [93] "Man scoops $273m lottery on ticket he left at the till"
## [94] "Turkey says speculation led to slump"
## [95] "Chechen activist faces penal colony after ‘fake’ drugs bust"
## [96] "The ‘happy house’ that is a misery for sellers"
## [97] "Cox’s view of backstop concessions hits pound"
## [98] "UK economy rebounds in January"
## [99] "You can be better off outside EU, says man tipped to replace Carney"
## [100] "Feathers fly after RSPB shoots down Kestrel sponsorship"
## [101] "Ryanair will ban Britons from buying shares"
## [102] "More powerful regulator to replace under-fire FRC"
## [103] "Someone will game even the fairest pay rules . . . you can bank on that"
## [104] "Barrick’s tale of the unexpected"
## [105] "Suitors show interest in G4S’s cash-handling business"
## [106] "Resilient economy boosts banks, housebuilders and retailers"
## [107] "888 Holdings spreads its bets"
## [108] "Doubts expressed over Gilbert’s role at Revolut"
## [109] "Battle ends with gold miners on same side"
## [110] "British Land chairman is heading for the door"
## [111] "Van hire boss faces crunch meeting after investor’s move"
## [112] "Corporate confidence falls to new low"
## [113] "Financial sector ‘to start cutting jobs’"
## [114] "Jeans maker Levi fashions a return to the market"
## [115] "Ex-Superdry chief not welcome back, bosses tell investors"
## [116] "Accounting error increases Kier’s debt"
## [117] "No-deal is a stepping stone to breaking free from too many tariffs"
## [118] "Sterling rises on May deal"
## [119] "Central banks need to ease up on QE, warns the man who anticipated crisis"
## [120] "Igas turns up heat on shale gas exploration"
## [121] "WPP signs Microsoft chief to help battle with online giants"
## [122] "Cairn Energy still waiting for Indian tax row ruling"
## [123] "Law firm breaks new ground with £366m flotation"
## [124] "Clarkson sinks after warning of gloom ahead"
## [125] "Unilever chief paid €11.7m despite failed restructuring"
## [126] "Provident rubbishes rival’s ‘misleading’ takeover bid"
## [127] "Interserve lenders ready to sweeten rescue deal"
## [128] "Stout approach is not about to change"
## [129] "HMV rescue hits flat note for creditors"
## [130] "Bank puts two former army colleagues in the front line"
## [131] "Companies should be allowed to choose how best to deliver justice"
## [132] "Your three-minute digest"
## [133] "The beauty of manufacturing and exporting is more than skin deep"
## [134] "Buveur D’Air and Apple’s Jade ready to serve up classic"
## [135] "Champion tipster Rob Wright’s best bets for day one of Cheltenham"
## [136] "Conte payoff row with Chelsea escalates"
## [137] "Zidane faces a completely different test in second coming: the rebuild of Real"
## [138] "Public will inflict proper punishment, not courts"
## [139] "Grealish attacker jailed for 14 weeks as FA seeks talks over player safety"
## [140] "‘I couldn’t sleep for days – I’ve reached out to Jack’"
## [141] "Lions should give Gatland the job now"
## [142] "Times Sport Unseen: the best of our photographers’ pictures this week"
## [143] "Collective punishment will not solve crowd trouble"
## [144] "How is security at football grounds supposed to work?"
## [145] "City still teenagers in Europe even if we win it, says Guardiola"
## [146] "Nine months on, Zidane is back to rescue ailing Real"
## [147] "Art meets science: how a horse jumps"
## [148] "From birth to Cheltenham: the making of a contender"
## [149] "Rising stars, party bars and the year of women jockeys"
## [150] "Focus on safety like never before"
## [151] "The big questions this week"
## [152] "The Game Dissected: how Pérez is spearheading Newcastle’s survival bid"
## [153] "The key owners to follow at the Cheltenham Festival"
## [154] "Times Sport Dissects: How to win the Gold Cup"
## [155] "Van Dijk: we need to find rare resilience"
## [156] "Gündogan’s exit hint gives City dilemma"
## [157] "‘I used to be spiteful but I have grown soft. A bit of me has gone’"
## [158] "England pile on pressure as Wales bid for grand slam"
## [159] "Top European clubs fight Nations League"
## [160] "Heavy rollers and flatter seams aimed at giving batsmen a fair crack"
## [161] "Bayliss gives Archer no guarantees"
## [162] "Picasso of the baize finding new art forms"
## [163] "Pakistan demand action on India caps"
## [164] "Trippier faces battle to be fit for England squad"
## [165] "City to compensate Bennell abuse victims"
## [166] "Silva charged over referee rant"
## [167] "Bolton wages still unpaid"
## [168] "Olympic medallist Ogogo forced to retire"
## [169] "Djokovic ‘snubbed’ Federer over Kermode contract meeting"
## [170] "Champion tipster of the year Rob Wright’s racing tips"
## [171] "A master at work: Britain’s greatest sports photographer’s best pictures"
## [172] "Wallace Broecker"
## [173] "Tony Pike"
## [174] "Vanda Salmon"
## [175] "Lives remembered"
## [176] "March 11"
## [177] "Tempestuous weather during ‘clanging arch of steel-grey March’"
## [178] "The disarmament of Germany"
## [179] "Crossword Club"
## [180] "Times Concise No 7910"
## [181] "Times Quick Cryptic No 1306"
## [182] "Times Cryptic No 27296"
## [183] "Concise Quintagram No 321"
## [184] "Cryptic Quintagram No 321"
## [185] "Sudoku No 10558 Super fiendish"
## [186] "Sudoku No 10557 Difficult"
## [187] "Sudoku No 10556 Mild"
## [188] "Killer Sudoku No 6480 Tough"
## [189] "Killer Sudoku No 6479 Moderate"
## [190] "Brain Trainer No 2825"
## [191] "Cell Blocks No 3477"
## [192] "Codeword No 3594"
## [193] "Futoshiki No 3387"
## [194] "Kakuro No 2346"
## [195] "KenKen No 4586"
## [196] "Lexica No 4694"
## [197] "Lexica No 4693"
## [198] "Polygon"
## [199] "Set Square No 2349"
## [200] "Suko No 2495"
## [201] "Bridge"
## [202] "Chess"
## [203] "Age-proof your brain"
## [204] "Angie Thomas — the new queen of teen fiction — on race, poverty, TV comedy and Harry Potter"
## [205] "Do 1 in 7 students really cheat?"
## [206] "Dr Mark Porter: Blood pressure treatment is on the rise. Here’s why"
## [207] "Feeling the burn? Here’s what you can do about acid reflux"
## [208] "Three ways to prepare for the hay fever season"
## [209] "Food fight: cashews v almonds"
## [210] "Robert Crampton: You say ‘I’m a free spirit’, but you’re just trying to avoid taking the bins out"
## [211] "J-Lo and A-Rod"
## [212] "Victoria at the Grand Theatre, Leeds"
## [213] "The Hold Steady at the Electric Ballroom, NW1"
## [214] "Lizz Wright at the Queen Elizabeth Hall"
## [215] "London Symphony Orchestra/ Haitink at the Barbican"
## [216] "National Dance Company Wales at the Linbury, Royal Opera House"
## [217] "The Times Daily Quiz"
## [218] "TV review: Cheat; The Choir — Our School by the Tower"
## [219] "What’s on TV tonight"
## [220] "Lindsey Bareham’s mussels with bacon"
## [221] "Captain Marvel"
## [222] "Disinfectants ‘fuelling rise of MRSA on patient wards’"
## [223] "‘Final justice’ for victims as serial killer dies in jail"
## [224] "Rail union threatens carnage if members denied drivers’ deal"
## [225] "Observatory razed in fire to be rebuilt"
## [226] "Antisemitism row as ex-MP shares offensive cartoon"
## [227] "Calls grow for ban on sex-for-rent ads"
## [228] "Labour offers no more than gesture politics"
## [229] "New channel has replaced that whiff of the kailyard"
## [230] "Jim Arnold"
## [231] "Lack of choice in school subjects hits job prospects"
## [232] "Forgive me, begs man jailed for deliberately spreading HIV"
## [233] "Politicians are wildly out of touch on smacking ban"
## [234] "Academic ‘said he would help student if he could spank him’"
## [235] "Scots prefer independence to no-deal Brexit or May’s plan"
## [236] "Cheating in exams on rise"
## [237] "13-year-old boy stabbed during school lunch break"
## [238] "Confidence among Scottish companies reaches new low"
## [239] "Scots hit by McGregor’s retirement"
## [240] "Morelos told not to hold back tonight"
## [241] "Jack: abuse from fans in Aberdeen is a compliment"
## [242] "McInnes: we haven’t missed our chance"
## [243] "Kilmarnock winning in the rain"
## [244] "Russell: we’re still trying to see positives"
## [245] "Brexit backstop fears can be put to bed, Varadkar insists"
## [246] "Boeing crash victim ‘wanted to save the world’"
## [247] "UN warned it would raise alarm over princess’s fate"
## [248] "Young homeowners would be hardest hit by ECB rate rise"
## [249] "‘Brave’ cervical cancer campaigner helps to boost HPV vaccine uptake"
## [250] "Nurses head back to Labour Court in row over contract"
## [251] "Beast O’Driscoll ‘killed in street after IRA dispute’"
## [252] "Alleged Omagh bombers declared bankrupt"
## [253] "Farmer accused of murder asked gardaí how DJ died"
## [254] "General feels ‘sympathy’ after Ballymurphy killings"
## [255] "NUI professor on a mission to explore the Indian Ocean"
## [256] "Isis bride will not be stripped of citizenship, says Varadkar"
## [257] "IRA victims condemn honour for McGuinness"
## [258] "Online plan to help clarify payments for child welfare"
## [259] "Children’s trains halted as insurance crisis takes toll on small businesses"
## [260] "‘Damaging gusts’ expected as Storm Gareth blows in"
## [261] "90% of health and school contracts going over budget"
## [262] "Store raider used trollies to halt gardaí"
## [263] "Sinn Féin to reject higher carbon tax"
## [264] "Westminster ‘should end’ Northern Ireland abortion ban"
## [265] "Mullein arrive with unmistakable clumps of grey-green, felted leaves"
## [266] "Macron’s vision for Europe has a blind spot"
## [267] "Crying wolf about Brexit has made us fail to prepare"
## [268] "Ireland can no longer ignore Islamist threat"
## [269] "The Wrong Home"
## [270] "Irish mortgage rates will be expensive ‘for years to come’"
## [271] "Dublin the big winner of ‘Brexodus’"
## [272] "Short-term notes set to boost coffers"
## [273] "Software firm in Canadian breakthrough"
## [274] "Schmidt: we need to maintain top level to win"
## [275] "Pro14 to carry on during the World Cup"
## [276] "Ireland still face some big questions"
## [277] "How Schmidt’s arrival signalled the start of France’s worst nightmare"
## [278] "Veterans are crucial to giving Wales a run for their money"
## [279] "Sayeh still breaking down barriers after escape from Liberia"
## [280] "Division 1 proving that there really is no place like home"
## [281] "Intercounty demands leave players with no life – Scallan"
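The result above is a plain character vector. As a rough sketch of the "structured form" step, the same pattern can fill a table with one row per headline. The HTML below is a made-up stand-in for a real page so the example runs offline. Only the `.Item-headline` class comes from the code above; the surrounding markup and the links are assumptions.

```r
library(tidyverse)
library(rvest)

# A tiny made-up page standing in for a real site (no network access needed)
page <- read_html('
  <html><body>
    <div class="Item">
      <h3 class="Item-headline"><a href="/article/one">First headline</a></h3>
    </div>
    <div class="Item">
      <h3 class="Item-headline"><a href="/article/two">Second headline</a></h3>
    </div>
  </body></html>
')

# Select each headline element once, then pull out the pieces we want
headlines <- page %>% html_nodes(".Item-headline")

# One row per headline: its text and the link it points to
scraped <- tibble(
  headline = headlines %>% html_text(trim = TRUE),
  link     = headlines %>% html_node("a") %>% html_attr("href")
)

scraped
```

Once the data is in a table, it can be filtered, joined or written out with the usual tidyverse tools.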
API: application programming interface
Structured way for programs to talk to each other
Lots of organisations provide them: private sector and public sector
Endpoint: “/Published/Notices/OCDS/Search”
Parameter: “stages=award”
Parameter: “order=ASC”
Parameter: “page=1”
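The endpoint and parameters above combine into a single request URL. This minimal sketch shows how, using base R only. The slides do not give the host, so `base_url` is a placeholder, and `build_url` is a hypothetical helper written for this example rather than part of any package.

```r
# ASSUMPTION: the host is not given in the slides; this is a placeholder
base_url <- "https://example.org"

# Endpoint and parameters as listed above
endpoint <- "/Published/Notices/OCDS/Search"
params   <- list(stages = "award", order = "ASC", page = 1)

# Hypothetical helper: glue base URL, endpoint and key=value pairs together
build_url <- function(base_url, endpoint, params) {
  values <- vapply(
    params,
    function(p) utils::URLencode(as.character(p), reserved = TRUE),
    character(1)
  )
  query <- paste0(names(params), "=", values, collapse = "&")
  paste0(base_url, endpoint, "?", query)
}

request_url <- build_url(base_url, endpoint, params)
request_url

# Paging through results only means changing one parameter
page_urls <- vapply(
  1:3,
  function(i) build_url(base_url, endpoint, utils::modifyList(params, list(page = i))),
  character(1)
)
page_urls
```

If the API returns JSON, a URL built this way could then be passed to something like `jsonlite::fromJSON()` to get the response as an R object.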
Live demo! https://www.zap-map.com/live/